GenERRate: Generating Errors for Use in Grammatical Error Detection

نویسندگان

  • Jennifer Foster
  • Øistein E. Andersen
چکیده

This paper explores the issue of automatically generated ungrammatical data and its use in error detection, with a focus on the task of classifying a sentence as grammatical or ungrammatical. We present an error generation tool called GenERRate and show how GenERRate can be used to improve the performance of a classifier on learner data. We describe initial attempts to replicate Cambridge Learner Corpus errors using GenERRate.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Generating artificial errors for grammatical error correction

This paper explores the generation of artificial errors for correcting grammatical mistakes made by learners of English as a second language. Artificial errors are injected into a set of error-free sentences in a probabilistic manner using statistics from a corpus. Unlike previous approaches, we use linguistic information to derive error generation probabilities and build corpora to correct sev...

متن کامل

Grammatical Error Correction of English as Foreign Language Learners

This study aimed to discover the insight of error correction by implementing two correction systems on three Iranian university students. The three students were invited to write four in-class essays throughout the semester, in which their verb errors and individual-selected errors were corrected using the Code Correction System and the Individual Correction System. At the end of the study, the...

متن کامل

Grammatical error prediction

In this thesis, we investigate methods for automatic detection, and to some extent correction , of grammatical errors. The evaluation is based on manual error annotation in the Cambridge Learner Corpus ((((), and automatic or semi-automatic annotation of error corpora is one possible application, but the methods are also applicable in other settings, for instance to give learners feedback on th...

متن کامل

Constrained Grammatical Error Correction using Statistical Machine Translation

This paper describes our use of phrasebased statistical machine translation (PBSMT) for the automatic correction of errors in learner text in our submission to the CoNLL 2013 Shared Task on Grammatical Error Correction. Since the limited training data provided for the task was insufficient for training an effective SMT system, we also explored alternative ways of generating pairs of incorrect a...

متن کامل

Data Driven Grammatical Error Detection in Transcripts of Children's Speech

We investigate grammatical error detection in spoken language, and present a data-driven method to train a dependency parser to automatically identify and label grammatical errors. This method is agnostic to the label set used, and the only manual annotations needed for training are grammatical error labels. We find that the proposed system is robust to disfluencies, so that a separate stage to...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009